scPipe: a flexible data preprocessing pipeline for single-cell RNA-sequencing data
نویسندگان
چکیده
Single-cell RNA sequencing (scRNA-seq) technology allows researchers to profile the transcriptomes of thousands of cells simultaneously. Protocols that incorporate both designed and random barcodes to label individual cells and molecules have greatly increased the throughput of scRNA-seq, but give rise to a more complex data structure. There is a need for new tools that can handle the various barcodes used by different protocols and exploit this information for quality assessment at the sample-level and provide effective visualization of these results in preparation for higher-level analyses. We developed scPipe, an R package that allows barcode demultiplexing, read alignment and 1
منابع مشابه
A Graph-Based Clustering Approach to Identify Cell Populations in Single-Cell RNA Sequencing Data
Introduction: The emergence of single-cell RNA-sequencing (scRNA-seq) technology has provided new information about the structure of cells, and provided data with very high resolution of the expression of different genes for each cell at a single time. One of the main uses of scRNA-seq is data clustering based on expressed genes, which sometimes leads to the detection of rare cell populations. ...
متن کاملA Graph-Based Clustering Approach to Identify Cell Populations in Single-Cell RNA Sequencing Data
Introduction: The emergence of single-cell RNA-sequencing (scRNA-seq) technology has provided new information about the structure of cells, and provided data with very high resolution of the expression of different genes for each cell at a single time. One of the main uses of scRNA-seq is data clustering based on expressed genes, which sometimes leads to the detection of rare cell populations. ...
متن کاملzUMIs: A fast and flexible pipeline to process RNA sequencing data with UMIs
RNA sequencing is increasingly performed with less starting material and at a higher sample throughput, e.g. to analyse single-cell transcriptomes. In this context, unique molecular identifiers (UMIs) are used to reduce amplification noise and sample-specific barcodes are used to track libraries. Here, we present a fast and flexible pipeline to process data from such RNA-seq protocols. Availabi...
متن کاملAssessing the consistency of public human tissue RNA-seq data sets
Sequencing-based gene expression methods like RNA-sequencing (RNA-seq) have become increasingly common, but it is often claimed that results obtained in different studies are not comparable owing to the influence of laboratory batch effects, differences in RNA extraction and sequencing library preparation methods and bioinformatics processing pipelines. It would be unfortunate if different expe...
متن کاملEngineering a high-performance SNP detection pipeline
We present Sprite, a bioinformatic data analysis pipeline for detecting single nucleotide polymorphisms (SNPs) in the human genome. A SNP detection pipeline for next-generation sequencing data uses several software tools, including tools for read preprocessing, read alignment, and SNP calling. We target end-to-end scalability and I/O efficiency in Sprite by merging tools in this pipeline and el...
متن کامل